Scalable visualisation methods for modern Generalized Additive Models
In the last two decades the growth of computational resources has made it
possible to handle Generalized Additive Models (GAMs) that formerly were too
costly for serious applications. However, the growth in model complexity has
not been matched by improved visualisations for model development and results
presentation. Motivated by an industrial application in electricity load
forecasting, we identify the areas where the lack of modern visualisation tools
for GAMs is particularly severe, and we address the shortcomings of existing
methods by proposing a set of visual tools that a) are fast enough for
interactive use, b) exploit the additive structure of GAMs, c) scale to large
data sets and d) can be used in conjunction with a wide range of response
distributions. All the new visual methods proposed in this work are implemented
by the mgcViz R package, which can be found on the Comprehensive R Archive
Network.
qgam: Bayesian non-parametric quantile regression modelling in R
Generalized additive models (GAMs) are flexible non-linear regression models,
which can be fitted efficiently using the approximate Bayesian methods provided
by the mgcv R package. While the GAM methods provided by mgcv are based on the
assumption that the response distribution is modelled parametrically, here we
discuss more flexible methods that do not entail any parametric assumption. In
particular, this article introduces the qgam package, which is an extension of
mgcv providing fast calibrated Bayesian methods for fitting quantile GAMs
(QGAMs) in R. QGAMs are based on a smooth version of the pinball loss of
Koenker (2005), rather than on a likelihood function, hence jointly achieving
satisfactory accuracy of the quantile point estimates and coverage of the
corresponding credible intervals requires adopting the specialized Bayesian
fitting framework of Fasiolo, Wood, Zaffran, Nedellec, and Goude (2020b). Here
we detail how this framework is implemented in qgam and we provide examples
illustrating how the package should be used in practice.
Fast calibrated additive quantile regression
We propose a novel framework for fitting additive quantile regression models,
which provides well calibrated inference about the conditional quantiles and
fast automatic estimation of the smoothing parameters, for model structures as
diverse as those usable with distributional GAMs, while maintaining equivalent
numerical efficiency and stability. The proposed methods are at once
statistically rigorous and computationally efficient, because they are based on
the general belief-updating framework of Bissiri et al. (2016) for loss-based
inference, but are computed by adapting the stable fitting methods of Wood et al.
(2016). We show how the pinball loss is statistically suboptimal relative to a
novel smooth generalisation, which also gives access to fast estimation
methods. Further, we provide a novel calibration method for efficiently
selecting the 'learning rate' balancing the loss with the smoothing priors
during inference, thereby obtaining reliable quantile uncertainty estimates.
Our work was motivated by a probabilistic electricity load forecasting
application, used here to demonstrate the proposed approach. The methods
described here are implemented by the qgam R package, available on the
Comprehensive R Archive Network (CRAN).
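
For readers unfamiliar with the pinball loss mentioned in this abstract, the following is a minimal sketch in plain Python. It shows only the standard (non-smooth) pinball loss and a brute-force grid search for the constant that minimises it; it does not implement the smooth generalisation or the calibrated Bayesian fitting proposed in the paper, and it is not the qgam API. The function names `pinball_loss` and `best_constant_quantile` and the `grid_size` parameter are hypothetical, chosen for illustration.

```python
def pinball_loss(y, q, tau):
    """Mean pinball (check) loss of the constant prediction q at level tau.

    Residuals above q are weighted by tau, residuals below q by (1 - tau).
    """
    total = 0.0
    for yi in y:
        diff = yi - q
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(y)


def best_constant_quantile(y, tau, grid_size=2001):
    """Grid-search the constant that minimises the pinball loss on y.

    Illustrative only: a coarse search over [min(y), max(y)], not an
    efficient or smooth fitting method.
    """
    lo, hi = min(y), max(y)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    return min(grid, key=lambda q: pinball_loss(y, q, tau))


if __name__ == "__main__":
    data = list(range(101))  # 0, 1, ..., 100
    print(best_constant_quantile(data, tau=0.9))
```

The minimiser of the pinball loss over constants is an empirical quantile of the data, which is why the loss is the usual starting point for quantile regression.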
qgam: Bayesian Nonparametric Quantile Regression Modeling in R
Generalized additive models (GAMs) are flexible non-linear regression models, which can be fitted efficiently using the approximate Bayesian methods provided by the mgcv R package. While the GAM methods provided by mgcv are based on the assumption that the response distribution is modeled parametrically, here we discuss more flexible methods that do not entail any parametric assumption. In particular, this article introduces the qgam package, which is an extension of mgcv providing fast calibrated Bayesian methods for fitting quantile GAMs (QGAMs) in R. QGAMs are based on a smooth version of the pinball loss of Koenker (2005), rather than on a likelihood function, hence jointly achieving satisfactory accuracy of the quantile point estimates and coverage of the corresponding credible intervals requires adopting the specialized Bayesian fitting framework of Fasiolo, Wood, Zaffran, Nedellec, and Goude (2021b). Here we detail how this framework is implemented in qgam and we provide examples illustrating how the package should be used in practice.
Conformal Prediction with Missing Values
Conformal prediction is a theoretically grounded framework for constructing predictive intervals. We study conformal prediction with missing values in the covariates, a setting that brings new challenges to uncertainty quantification. We first show that the marginal coverage guarantee of conformal prediction holds on imputed data for any missingness distribution and almost all imputation functions. However, we emphasize that the average coverage varies depending on the pattern of missing values: conformal methods tend to construct prediction intervals that undercover the response conditionally on some missing patterns. This motivates our novel generalized conformalized quantile regression framework, missing data augmentation, which yields prediction intervals that are valid conditionally on the patterns of missing values, despite their exponential number. We then show that a universally consistent quantile regression algorithm trained on the imputed data is Bayes optimal for the pinball risk, thus achieving valid coverage conditionally on any given data point. Moreover, we examine the case of a linear model, which demonstrates the importance of our proposal in overcoming the heteroskedasticity induced by missing values. Using synthetic data and data from critical care, we corroborate our theory and report improved performance of our methods.
Adaptive Conformal Predictions for Time Series
Uncertainty quantification of predictive models is crucial in decision-making problems. Conformal prediction is a general and theoretically sound answer. However, it requires exchangeable data, excluding time series. While recent works tackled this issue, we argue that Adaptive Conformal Inference (ACI, Gibbs and Candès, 2021), developed for distribution-shift time series, is a good procedure for time series with general dependency. We theoretically analyse the impact of the learning rate on its efficiency in the exchangeable and auto-regressive case. We propose a parameter-free method, AgACI, that adaptively builds upon ACI based on online expert aggregation. We conduct extensive, fair simulations against competing methods, and the results support ACI's use in time series. We conduct a real case study: electricity price forecasting. The proposed aggregation algorithm provides efficient prediction intervals for day-ahead forecasting. All the code and data needed to reproduce the experiments are made available.
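
As a rough illustration of the ACI procedure this abstract builds on, the sketch below implements the online update of Gibbs and Candès (2021), alpha_{t+1} = alpha_t + gamma * (alpha - err_t), in plain Python. Several choices here are assumptions made for illustration and are not taken from the paper: the absolute-residual conformity score, the `warmup` period, and the names `empirical_quantile` and `aci`. The sketch covers ACI with a single fixed learning rate `gamma` only; it does not implement the AgACI aggregation step.

```python
import math


def empirical_quantile(scores, level):
    """Smallest score s such that at least `level` of the scores are <= s.

    Levels >= 1 give an infinite radius (interval always covers);
    levels <= 0 give -inf (empty interval, never covers).
    """
    if level >= 1.0:
        return math.inf
    if level <= 0.0:
        return -math.inf
    ordered = sorted(scores)
    k = math.ceil(level * len(ordered)) - 1
    return ordered[max(k, 0)]


def aci(y, preds, alpha=0.1, gamma=0.01, warmup=50):
    """Run ACI with absolute-residual scores (an illustrative assumption).

    Returns the list of miscoverage indicators err_t and the list of
    working levels alpha_t used after the warm-up period.
    """
    alpha_t = alpha
    # Warm-up: collect initial conformity scores without adapting.
    scores = [abs(y[i] - preds[i]) for i in range(warmup)]
    errors, alphas = [], []
    for t in range(warmup, len(y)):
        radius = empirical_quantile(scores, 1.0 - alpha_t)
        score = abs(y[t] - preds[t])
        err = 0 if score <= radius else 1  # 1 when y[t] falls outside
        errors.append(err)
        alphas.append(alpha_t)
        # ACI update: widen intervals after a miss, shrink after a cover.
        alpha_t += gamma * (alpha - err)
        scores.append(score)
    return errors, alphas
```

A larger `gamma` makes the working level react faster to recent misses at the cost of more variable interval widths, which is the trade-off the abstract refers to when it discusses the impact of the learning rate.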